Building Domain-Speci c Search Engines with Machine Learning Techniques
نویسندگان
چکیده
Domain-speci c search engines are growing in popularity because they o er increased accuracy and extra functionality not possible with the general, Web-wide search engines. For example, www.campsearch.com allows complex queries by age-group, size, location and cost over summer camps. Unfortunately these domain-speci c search engines are di cult and timeconsuming to maintain. This paper proposes the use of machine learning techniques to greatly automate the creation and maintenance of domain-speci c search engines. We describe new research in reinforcement learning, information extraction and text classi cation that enables e cient spidering, identifying informative text segments, and populating topic hierarchies. Using these techniques, we have built a demonstration system: a search engine for computer science research papers. It already contains over 50,000 papers and is publicly available at www.cora.justresearch.com.
منابع مشابه
A Machine Learning Approach to Building Domain-Speci c Search Engines
Domain-speci c search engines are becoming increasingly popular because they o er increased accuracy and extra features not possible with general, Web-wide search engines. Unfortunately, they are also di cult and timeconsuming to maintain. This paper proposes the use of machine learning techniques to greatly automate the creation and maintenance of domain-speci c search engines. We describe new...
متن کاملA Machine Learning Approach to Building Domain-Specific Search Engines
Domain-specific search engines are becoming increasingly popular because they offer increased accuracy and extra features not possible with general, Web-wide search engines. Unfortunately, they are also difficult and timeconsuming to maintain. This paper proposes the use of machine learning techniques to greatly automate the creation and maintenance of domain-specific search engines. We describ...
متن کاملBuilding Domain-Specific Search Engines with Machine Learning Techniques
Domain-specific search engines are growing in popularity because they offer increased accuracy and extra functionality not possible with the general, Web-wide search engines. For example, www.campsearch.com allows complex queries by age-group, size, location and cost over .summer camps. Unfortunately these domain-specific search engines are difficult and timeconsuming to maintain. This paper pr...
متن کاملKeyword Spices: A New Method for Building Domain-Specific Web Search Engines
This paper presents a new method for building domain-specific web search engines. Previous methods eliminate irrelevant documents from the pages accessed using heuristics based on human knowledge about the domain in question. Accordingly, they are hard to build and can not be applied to other domains. The keyword spice method, in contrast, improves search performance by adding domain-specific k...
متن کاملSearching the Web: learning based techniques
Searching and retrieving information from the Web poses new issues that can be e ectively tackled by applying machine learning techniques. In particular, the fast dynamics of the information available on the Internet requires new approaches for indexing; the huge amount of available data is hardly manageable by humans and, on the other hand, can provide large sets of examples for learning algor...
متن کامل